perm filename A08.TEX[106,PHY] blob sn#807711 filedate 1985-11-20 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	\magnification\magstephalf
C00020 ENDMK
CāŠ—;
\magnification\magstephalf
\input macro.tex
\def\today{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\day, \number\year}
\baselineskip 14pt
\rm
\line{\sevenrm a08.tex[106,phy] \today\hfill}
\line{\copyright July 2, 1984 Robert W. Floyd\hfill}

\def\spa{\tt\char'40 }%use it as {\spa} or else changes everything after into \tt

\bigskip
The character array $\hbox{IMAGE}[0\ldt N+1,0\ldt N+1]$
contains an image made up of asterisks against a background of spaces.
For example:
$$\vbox{\tabskip=0pt\offinterlineskip
\halign{\rt{#}\quad&\vrule#&\strut\quad$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$%
\quad&\vrule#\cr
&\omit&0&1&2&3&4&5&6&7&8&9&10&\omit\cr
&\multispan{13}\hrulefill\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
0&&&&&&&&&&&&&\cr
1&&&&*&*&*&*&*&*&*&&&\cr
2&&&&*&&&&*&*&*&&&\cr
3&&&&&&*&&*&*&&&&\cr
4&&&&&&*&&*&&&&&\cr
5&&&&&*&*&*&*&&&&&\cr
6&&&&&&*&&&*&&*&&\cr
7&&&*&&&*&&*&*&*&*&&\cr
8&&&*&*&*&*&&&&&*&&\cr
9&&&*&&&*&*&*&&&&&\cr
10&&&&&&&&&&&&&\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
&\multispan{13}\hrulefill\cr}}$$
To simplify the boundary conditions, we assume that the outer rows and columns
are blank.

Let's say the image is a digitized rendition of a microphotograph of chromosomes
or protozoa, and we want a program to count them, erasing each from the
image as we count it. We assume that each object in the image is
connected horizontally and vertically; the diagonal from
IMAGE[5,6] to IMAGE[6,7] is not a connection, so there are two objects
in the picture above. We want a subalgorithm which, given the array coordinates
$(R,C)$ of an asterisk, erases the entire object~$F$ to which the asterisk
belongs, changing asterisks into spaces. The reader is invited to design
this subalgorithm. Try your algorithm on the array shown, with $R=5$,
$C=4$. Close the book and come back in an hour or so.

\vfill

\hrule

\vfill

\bigskip
Solution, first version. This method is inefficient, but fairly easy to
show correct. As the algorithm operates, each point will be Black~(`$\ast$'),
Grey~(`--') or 
White~(`{\spa}'). 
Color grey the initial point $P=(R,C)$. The
grey area can be pushed outward, by making grey any black point next to
a grey point. Grey points may be made white if they don't touch any black
points. The underlying idea is that the grey area marks the boundary
between the finished and unfinished portions of the task. After partly
executing the algorithm, the array may look like this:

\eject

$$\vbox{\tabskip=0pt\offinterlineskip
\halign{\rt{#}\quad&\vrule#&\strut\quad$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$%
\quad&\vrule#\cr
&\omit&0&1&2&3&4&5&6&7&8&9&10&\omit\cr
&\multispan{13}\hrulefill\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
0&&&&&&&&&&&&&\cr
1&&&&*&*&*&*&*&*&*&&&\cr
2&&&&*&&&&-&*&*&&&\cr
3&&&&&&&&-&*&&&&\cr
4&&&&&&&&-&&&&&\cr
5&&&&&-&&&&&&&&\cr
6&&&&&&&&&*&&*&&\cr
7&&&*&&&-&&*&*&*&*&&\cr
8&&&*&*&*&-&&&&&*&&\cr
9&&&*&&&-&*&*&&&&&\cr
10&&&&&&&&&&&&&\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
&\multispan{13}\hrulefill\cr}}$$

The algorithm is:

Let $P=(R,C)$. Color $P$ grey.  While the set of grey points is non-empty,
do the following:

\bigskip\noindent
$\cases{\hbox{Let $Q$ be any grey point.}\cr
\hbox{Color grey all black neighbors of $Q$.}\cr
\hbox{Color $Q$ white.}\cr}$

\bigskip\noindent
For example, at the stage shown above, the iterated part of the
algorithm might set $Q=(2,6)$ and change the image to

$$\vbox{\tabskip=0pt\offinterlineskip
\halign{\rt{#}\quad&\vrule#&\strut\quad$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad%
&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$\quad&$\ctr{#}$%
\quad&\vrule#\cr
&\omit&0&1&2&3&4&5&6&7&8&9&10&\omit\cr
&\multispan{13}\hrulefill\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
0&&&&&&&&&&&&&\cr
1&&&&*&*&*&*&-&*&*&&&\cr
2&&&&*&&&&&-&*&&&\cr
3&&&&&&-&&-&*&&&&\cr
4&&&&&&&&-&&&&&\cr
5&&&&&-&&&&&&&&\cr
6&&&&&&&&&*&&*&&\cr
7&&&*&&&-&&*&*&*&*&&\cr
8&&&*&*&*&-&&&&&*&&\cr
9&&&*&&&-&*&*&&&&&\cr
10&&&&&&&&&&&&&\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit&\omit
&\cr
&\multispan{13}\hrulefill\cr}}$$

If you try the ``algorithm'', you will find empirically that it works.
While it is not a true algorithm, because often there is more than one way
of picking~$Q$, any particular way will work. But why does it work?
How can we be sure it is not a time bomb?
Will it work on figures with holes, figures that surround other figures,
spirals, anything at all?
We can see that it has several invariants.

\smallskip
\display 40pt:(1):
No point is ever darker than its original color, because the only changes
are black to grey and grey to white.

\smallskip
\display 40pt:(2):
As a consequence, the original white background stays white.

\smallskip
\display 40pt:(3):
When a point other than $P$ is made grey, it must already have a grey
neighbor, so there is no way for a first point in a different figure to become
grey. Only points in the figure containing $P$ (call it~$F$) can lighten.

\smallskip
\display 40pt:(4):
The algorithm never creates an adjacent white/black pair that was not
there originally, because it never makes a point black, and only makes one
white when no neighbors are black.

\smallskip
The algorithm must terminate, because the total blackess of the image 
(defined to be the number of grey points plus twice the number of black
points) is a decreasing natural number.

When the algorithm terminates, there are no grey points left. All figures
other than $F$ are unchanged. $P$~itself must be white. Along any
path from~$P$ within~$F$, we find only white points, since we can't
have a white next to a black. Therefore the whole figure has become white.
The number of executions of the iteration is exactly the size of~$F$,
since each execution makes one point white.

As the algorithm is formulated, finding a grey point might require a search
of $N↑2$~points. We can avoid this by keeping a list, called GLIST, which
includes every grey point.

A potentially more efficient subalgorithm, then, is this:

Let $P=(R,C)$. Color $P$ grey, and initialize GLIST to contain just~$P$.
While GLIST is nonempty,

\bigskip\noindent
$\cases{(*1*)&Delete a point $Q$ from GLIST.\cr
&Color $Q$ white.\cr
(*2*)&Color grey all black neighbors of $Q$,\cr
&adding them to GLIST.\cr}$

\bigskip
The useful invariants of this algorithm are:

\smallskip
\display 40pt:(1):
No point is ever darker than its original color, as before.

\smallskip
\display 40pt:(2):
GLIST contains the set of grey points.

\smallskip
\display 40pt:(3):
As before, background stays white, and figures other than $F$ are unchanged.

\smallskip
\display 40pt:(4):
The algorithm never creates a new adjacent white/black pair, as before.

\smallskip
To see that the algorithm terminates, notice that one point becomes white
at each iteration. The number of iterations must be exactly the number of
points in~$F$.

We can use an array {\tt GLIST[1:NN]} to hold GLIST, where {\tt NN} need be no more
than~$N↑2$. Use a natural number variable,~{\tt G}, initially zero, to hold
the size of GLIST. Now the subalgorithm, in Pascal, is

\vfill\eject

%\smallskip
{\obeylines\obeyspaces\let =\ \tt
        G:=1
        GLISTR[1]:=R; GLISTC[1]:=C; IMAGE[R,C]:='-';

        WHILE G>0 DO
            BEGIN
            QR:=GLISTR[G]; QC:=GLISTC[G];
            G:=G-1; IMAGE[QR,QC]:='{\spa}';
            UNBLACKEN(QR-1,QC);
            UNBLACKEN(QR,QC+1);
            UNBLACKEN(QR+1,QC);
            UNBLACKEN(QR,QC-1)
            END
}

\smallskip
\noindent where the procedure {\tt UNBLACKEN(R,C)} has the body

\smallskip
{\obeylines\obeyspaces\let =\ \tt
        IF IMAGE[R,C]='$\ast$' THEN
            BEGIN
            G:=G+1;
            GLISTR[G]:=R; GLISTC[G]:=C;
            IMAGE[R,C]:='$-$'
            END
}

\smallskip
A complete Pascal program to count the figures follows:

\ctrline{**** Fill in.}

\smallskip
\disleft 38pt::
{\bf Exercise:} 

\disleft 38pt::
Change each `--' to `{\spa}' in the above algorithm, and omit
{\tt IMAGE[QR,QC]:=`\spa'}. Does it still work?

\smallskip
Another algorithm for counting the connected figures in an image works by
repeatedly scanning the image from left to right, top to bottom, looking
for southeast corners, local patterns of 
$\vcenter{\halign{\lft{#}\cr $BW$\cr $W$\cr}}$.
When one is found, it is removed or pushed northwest, depending on its
surroundings. In the patterns below, the southeast corner in question
is circled.

\bigskip\bigskip
\disleft 40pt:(1):
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
&W\cr
W&B&W\cr
&W\cr}}$
\qquad Remove and add one to figure count.

\bigskip\bigskip
\disleft 40pt:(2):
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
B\cr
&B&W\cr
&W\cr}}$
\qquad Remove.

\bigskip\bigskip
\disleft 40pt:(3):
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
W&W\cr
&B&W\cr
&W\cr}}$
\qquad or\qquad
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
W\cr
W&B&W\cr
&W\cr}}$
\qquad Remove.

\bigskip\bigskip
\disleft 40pt:(4):
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
W&B\cr
B&B&W\cr
&W\cr}}$
\qquad becomes\qquad
$\vcenter{\halign{$\ctr{#}\;$&$\ctr{#}\;$&$\ctr{#}$\cr
B&B\cr
B&W&W\cr
&W\cr}}$

\bigskip\bigskip
\ctrline{Algorithm Movie Number 467}

$$\vcenter{\halign{\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\qquad\quad
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\quad\qquad
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\quad\qquad
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\quad\qquad
&\ctr{#}\xskip
&\ctr{#}\xskip
&\ctr{#}\quad\qquad
&\ctr{#}\xskip
&\ctr{#}\quad\qquad
&\ctr{#}\cr
*&*&*&*
&*&*&*&*
&*&*&*&*
&*&*&*&*
&*&*&*
&*&*&*\cr
*&&&*
&*&&&*
&*&&*&*
&*&*&*&
&*&*&
&*\cr
*&&&*
&*&&*&*
&*&*&*&
&*&*&&
&*\cr
*&*&*&*
&*&*&*&
&*&*&&
&*\cr}}$$

\bigskip
To show that this algorithm is correct, you need to show that the number
of figures plus the count of already erased figures is invariant. It is not
obvious that operation~(4) does not merge previously separate figures. Can that
happen?

You also need to show termination, of course. The algorithm stops if it scans the
entire image without finding a southeast corner. Find a natural number formula
that decreases whenever a southeast corner is found.

How many steps can this algorithm require, say on a 100~by~100 array? How
does that compare to the first algorithm?

\vfill\eject\end